131 research outputs found

    Partition Decoupling for Multi-gene Analysis of Gene Expression Profiling Data

    We present the extension and application of a new unsupervised statistical learning technique, the Partition Decoupling Method (PDM), to gene expression data. Because it can reveal non-linear and non-convex geometries present in the data, the PDM is an improvement over typical gene expression analysis algorithms, permitting a multi-gene analysis that can reveal phenotypic differences even when the individual genes do not exhibit differential expression. Here, we apply the PDM to publicly available gene expression data sets and demonstrate that we are able to identify cell types and treatments with higher accuracy than is obtained through other approaches. By applying it in a pathway-by-pathway fashion, we demonstrate how the PDM may be used to find sets of mechanistically related genes that discriminate phenotypes. Comment: Revise

    Pathways of Distinction Analysis: A New Technique for Multi–SNP Analysis of GWAS Data

    Genome-wide association studies (GWAS) have become increasingly common due to advances in technology and have permitted the identification of differences in single nucleotide polymorphism (SNP) alleles that are associated with diseases. However, while typical GWAS analysis techniques treat markers individually, complex diseases (cancers, diabetes, and Alzheimer's, amongst others) are unlikely to have a single causative gene. Thus, there is a pressing need for multi–SNP analysis methods that can reveal system-level differences in cases and controls. Here, we present a novel multi–SNP GWAS analysis method called Pathways of Distinction Analysis (PoDA). The method uses GWAS data and known pathway–gene and gene–SNP associations to identify pathways that permit, ideally, the distinction of cases from controls. The technique is based upon the hypothesis that, if a pathway is related to disease risk, cases will appear more similar to other cases than to controls (or vice versa) for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater within-class similarity than across classes. Importantly, PoDA improves on existing single–SNP and SNP–set enrichment analyses, in that it does not require the SNPs in a pathway to exhibit independent main effects. This permits PoDA to reveal pathways in which epistatic interactions drive risk. In this paper, we detail the PoDA method and apply it to two GWAS: one of breast cancer and the other of liver cancer. The results obtained strongly suggest that there exist pathway-wide genomic differences that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems level.

    Single nucleotide polymorphisms that modulate microRNA regulation of gene expression in tumors

    Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with trait diversity and disease susceptibility, yet the functional properties of many genetic variants and their molecular interactions remain unclear. It has been hypothesized that SNPs in microRNA binding sites may disrupt gene regulation by microRNAs (miRNAs), short non-coding RNAs that bind to mRNA and downregulate the target gene. While a number of studies have been conducted to predict the location of SNPs in miRNA binding sites, to date there has been no comprehensive analysis of how SNP variants may impact miRNA regulation of genes. Here we investigate the functional properties of genetic variants and their effects on miRNA regulation of gene expression in cancer. Our analysis is motivated by the hypothesis that distinct alleles may cause differential binding (from miRNAs to mRNAs or from transcription factors to DNA) and change the expression of genes. We previously identified pathways (systems of genes conferring specific cell functions) that are dysregulated by miRNAs in cancer, by comparing miRNA-pathway associations between healthy and tumor tissue. We draw on these results as a starting point to assess whether SNPs in genes on dysregulated pathways are responsible for miRNA dysregulation of individual genes in tumors. Using an integrative analysis that incorporates miRNA expression, mRNA expression, and SNP genotype data, we identify SNPs that appear to influence the association between miRNAs and genes, which we term "regulatory QTLs (regQTLs)": loci whose alleles impact the regulation of genes by miRNAs. We describe the method, apply it to analyze four cancer types (breast, liver, lung, prostate) using data from The Cancer Genome Atlas (TCGA), and provide a tool to explore the findings.

    Fasano-Franceschini Test: an Implementation of a 2-Dimensional Kolmogorov-Smirnov test in R

    The univariate Kolmogorov-Smirnov (KS) test is a non-parametric statistical test designed to assess whether a set of data is consistent with a given probability distribution (or, in the two-sample case, whether the two samples come from the same underlying distribution). The versatility of the KS test has made it a cornerstone of statistical analysis, and it is commonly used across the scientific disciplines. However, the test proposed by Kolmogorov and Smirnov does not naturally extend to multidimensional distributions. Here, we present the fasano.franceschini.test package, an R implementation of the 2-D KS two-sample test as defined by Fasano and Franceschini (Fasano and Franceschini 1987). The fasano.franceschini.test package provides three improvements over the current 2-D KS test on the Comprehensive R Archive Network (CRAN): (i) the Fasano and Franceschini test has been shown to run in O(n^2), versus the Peacock implementation, which runs in O(n^3); (ii) the package implements a procedure for handling ties in the data; and (iii) the package implements a parallelized bootstrapping procedure for improved significance testing. Ultimately, the fasano.franceschini.test package presents a robust statistical test for analyzing random samples defined in two dimensions. Comment: 8 pages, 4 figures
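As a rough illustration of the statistic this abstract refers to, the sketch below computes a two-sample, two-dimensional KS-type statistic in the commonly described Fasano and Franceschini style: each observed point serves as an origin, the plane is split into four quadrants around it, and the largest difference in quadrant fractions between the two samples is recorded. It is written in Python rather than R, it is not the package's code, and the function names (`quadrant_fractions`, `ff_statistic`) are hypothetical. Significance testing and tie handling, both of which the package provides, are left out; the strict inequalities used here simply ignore points that share a coordinate with the origin, which is an assumption of this sketch.

```python
import numpy as np


def quadrant_fractions(origin, pts):
    """Fraction of pts falling in each open quadrant around origin."""
    x0, y0 = origin
    x, y = pts[:, 0], pts[:, 1]
    counts = np.array([
        np.sum((x > x0) & (y > y0)),  # upper right
        np.sum((x < x0) & (y > y0)),  # upper left
        np.sum((x < x0) & (y < y0)),  # lower left
        np.sum((x > x0) & (y < y0)),  # lower right
    ])
    return counts / len(pts)


def max_quadrant_difference(origins, a, b):
    """Largest absolute gap in quadrant fractions over all origins."""
    return max(
        np.max(np.abs(quadrant_fractions(o, a) - quadrant_fractions(o, b)))
        for o in origins
    )


def ff_statistic(a, b):
    """Average of the maximal gaps found using points of a, then of b, as origins.

    Each of the n origins requires a pass over all points, so the cost is
    quadratic in the sample size.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d_a = max_quadrant_difference(a, a, b)
    d_b = max_quadrant_difference(b, a, b)
    return 0.5 * (d_a + d_b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_1 = rng.normal(size=(200, 2))
    sample_2 = rng.normal(loc=0.5, size=(200, 2))
    print(ff_statistic(sample_1, sample_2))
```

The statistic is 0 when the two samples are identical and approaches 1 when they occupy disjoint regions of the plane. Turning it into a p-value requires a reference distribution, for example by resampling, which this sketch does not attempt.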

    A minimal model of peripheral clocks reveals differential circadian re-entrainment in aging

    The mammalian circadian system comprises a network of cell-autonomous oscillators, spanning from the central clock in the brain to peripheral clocks in other organs. These clocks are tightly coordinated to orchestrate rhythmic physiological and behavioral functions. Dysregulation of these rhythms is a hallmark of aging, yet it remains unclear how age-related changes lead to more easily disrupted circadian rhythms. Using a two-population model of coupled oscillators that integrates the central clock and the peripheral clocks, we derive simple mean-field equations that can capture many aspects of the rich behavior found in the mammalian circadian system. We focus on three age-associated effects that have been posited to contribute to circadian misalignment: attenuated input from the sympathetic pathway, reduced responsiveness to light, and a decline in the expression of neurotransmitters. We find that the first two factors can significantly impede re-entrainment of the clocks following a perturbation, while a weaker coupling within the central clock does not affect the recovery rate. Moreover, using our minimal model, we demonstrate the potential of using the feed-fast cycle as an effective intervention to accelerate circadian re-entrainment. These results highlight the importance of peripheral clocks in regulating the circadian rhythm and provide fresh insights into the complex interplay between aging and the resilience of the circadian system.